Fix K-means k divergence: preserve vote-encounter row order #2524
Open
jucor wants to merge 1 commit into
This was referenced Mar 30, 2026
a9f2a48 to 5808e43
775510c to ae240ea
1e0564f to fbfa1d1
ballPointPenguin approved these changes on Apr 26, 2026
Pull request overview
Aligns Delphi’s Python math pipeline with the Clojure reference implementation by preserving vote-encounter (insertion) order for participant rows in the rating matrix, eliminating ordering-driven divergence in group-level k-means initialization and k selection.
Changes:
- Preserve participant (row) ordering by first appearance in `Conversation.update_votes()` and when filtering moderated participants in `_apply_moderation()`.
- Update ordering-related unit tests and enable group clustering comparisons for cold-start blobs (xfail only for incremental blobs).
- Refresh the vw cold-start blob and update investigation/plan/journal documentation for the resolved k-divergence root cause.
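As a rough illustration of the ordering change described above, the sketch below contrasts first-appearance order with numeric sorting and shows a moderation filter that keeps the surviving rows in place. It is not the code from `conversation.py`: the helper names `encounter_order` and `drop_moderated`, the sample vote stream, and the use of plain `sorted()` as a stand-in for `natsorted()` on integer PIDs are all assumptions made for the example.

```python
def encounter_order(vote_pids):
    """Return participant IDs in order of first appearance."""
    # dict preserves insertion order (Python 3.7+), so this de-duplicates
    # while keeping the first occurrence of each PID.
    return list(dict.fromkeys(vote_pids))


def drop_moderated(row_pids, moderated):
    """Filter out moderated participants without reordering the rest."""
    moderated = set(moderated)
    return [pid for pid in row_pids if pid not in moderated]


# Hypothetical vote stream: the participant ID attached to each incoming vote.
vote_pids = [7, 2, 7, 10, 2, 1]

rows_encounter = encounter_order(vote_pids)  # first-appearance order
rows_sorted = sorted(set(vote_pids))         # numeric order (stand-in for natsorted)
rows_after_moderation = drop_moderated(rows_encounter, moderated=[2])

print(rows_encounter)         # [7, 2, 10, 1]
print(rows_sorted)            # [1, 2, 7, 10]
print(rows_after_moderation)  # [7, 10, 1]
```

The list comprehension in `drop_moderated` only removes entries, so whatever order the rows had going in is the order they have coming out.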
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| delphi/polismath/conversation/conversation.py | Preserves participant row encounter order in vote updates and moderation filtering; keeps columns natsorted. |
| delphi/tests/test_conversation.py | Updates tests to expect encounter order for participant rows while keeping comment columns natsorted. |
| delphi/tests/test_legacy_clojure_regression.py | Removes blanket xfail for clustering; xfails only for incremental blobs where comparison isn’t valid. |
| delphi/real_data/r6vbnhffkxbd7ifmfbdrd-vw/r6vbnhffkxbd7ifmfbdrd_math_blob_cold_start.json | Re-recorded blob to match updated pipeline outputs and ordering-dependent results. |
| delphi/docs/PLAN_DISCREPANCY_FIXES.md | Marks k-divergence fix as done and documents updated understanding/results. |
| delphi/docs/INVESTIGATION_K_DIVERGENCE.md | Adds resolved investigation writeup describing root cause, fix, and dataset outcomes. |
| delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md | Records investigation process/results and links to the investigation doc. |
## Summary

- Fix K-means k divergence between Python and Clojure by preserving vote-encounter order for participant rows in the rating matrix
- Python was using `natsorted()` (PID-numeric order) while Clojure's NamedMatrix preserves insertion order; different row ordering cascades into different first-k-distinct initialization seeds for group-level k-means
- On vw: Python picked k=4 (wrong), Clojure picks k=2; now both pick k=2 with identical cluster memberships

## Investigation findings

The divergence chain: rating_mat row order → PCA projection order → base-cluster ID assignment → group k-means first-k-distinct init → different local optima → different silhouette landscape → different k.

PCA components are identical (cosine similarity = 1.0), silhouette implementation matches, k-means algorithm matches; only the data ORDER feeding first-k-distinct differed.

## Changes

- `conversation.py`: `update_votes()` preserves vote-encounter order for participant rows instead of `natsorted()`
- `conversation.py`: `_apply_moderation()` preserves row order with list comprehension
- Column (comment ID) ordering remains `natsorted`; it doesn't affect clustering
- Re-recorded vw cold-start blob and golden snapshots
- Updated ordering tests, removed `test_group_clustering` xfail
- Added `scripts/investigate_k_divergence.py` diagnostic tool

## Cold-start blob results

| Dataset | Clj k | Py k | Match |
|---------|-------|------|-------|
| vw | 2 | 2 | exact (sizes [50,17]) |
| biodiversity | 2 | 2 | exact (sizes [81,19]) |
| bg2018 | 2 | 2 | close ([51,49] vs [52,48]) |
| FLI | 2 | 3 | inherent PCA divergence (94.5% NaN, sil gap 0.001) |

## Test plan

- [x] All 297 tests pass (0 failures, 58 xfailed)
- [x] vw cold-start: k=2 exact match with Clojure blob
- [x] biodiversity cold-start: k=2 exact match
- [x] Ordering tests updated to expect encounter order
- [ ] Re-record private dataset golden snapshots after stack rebase

🤖 Generated with [Claude Code](https://claude.com/claude-code)
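To make the first-k-distinct step of the divergence chain concrete, here is a minimal sketch of how row order alone can change the initial seeds. It assumes first-k-distinct means "scan the rows in order and take the first k that differ"; the function below, the 2-D coordinates, and the participant IDs are hypothetical and are not taken from the Delphi or Clojure code. It also skips the base-cluster stage and seeds directly from participant projections, purely to keep the example short.

```python
def first_k_distinct(points, k):
    """Pick the first k distinct points, scanning in the given order."""
    seeds = []
    for p in points:
        if p not in seeds:
            seeds.append(p)
        if len(seeds) == k:
            break
    return seeds


# Hypothetical 2-D projections keyed by participant ID: two tight blobs.
projections = {
    1: (0.0, 0.0),
    2: (0.1, 0.0),
    7: (5.0, 5.0),
    10: (5.1, 5.0),
}

encounter_rows = [7, 2, 10, 1]     # order in which votes were first seen
sorted_rows = sorted(projections)  # PID-numeric order: [1, 2, 7, 10]

seeds_encounter = first_k_distinct([projections[p] for p in encounter_rows], k=2)
seeds_sorted = first_k_distinct([projections[p] for p in sorted_rows], k=2)

print(seeds_encounter)  # [(5.0, 5.0), (0.1, 0.0)]
print(seeds_sorted)     # [(0.0, 0.0), (0.1, 0.0)]
```

With identical data, the encounter-order seeds land in different blobs while the sorted-order seeds both land in the same blob, so the two runs start k-means from different positions and can settle into different local optima.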
## Squashed commits

- Fix K-means k divergence: preserve vote-encounter order for participant rows
- Update plan and journal: K-divergence investigation resolved
- Remove investigation script (one-off diagnostic, not production code)
- Rename k-divergence doc: investigation record, not a handoff
- Update references to renamed investigation doc

commit-id:4598a0a1